Molecular detection and phylogenetic analysis of pigeon circovirus from racing pigeons in Northern China

Background Pigeon circovirus (PiCV) infections in pigeons (Columba livia) have been reported worldwide. Currently, pigeon racing is becoming increasingly popular and considered to be a national sport in China, and even, the greatest competitions of racing pigeons are taking place in China. However, there are still no epidemiologic data regarding PiCV infections among racing pigeons in China. The purpose of our study was to provide information of prevalence, genetic variation and evolution of PiCV from racing pigeons in China. Results To trace the prevalence, genetic variation and evolution of PiCV in sick and healthy racing pigeons, 622 samples were collected from 11 provinces or municipalities in China from 2016 to 2019. The results showed that the positive rate of PiCV was 19.3% (120/622) at the sample level and 59.0% (23/39) at the club level, thus suggesting that the virus was prevalent in Chinese racing pigeons. A sequence analysis revealed that the cap genes of the PiCV strains identified in our study displayed a high genetic diversity and shared nucleotide homologies of 71.9%–100% and amino acid homologies of 71.7%–100%. 28 and 36 unique amino acid substitutions were observed in the Cap and Rep proteins derived from our PiCV strains, respectively. A cladogram representation of PiCV strains phylogeny based on 90 cap gene sequences showed that the strains in this study could be further divided into seven clades (A, B, C, E, G, H, and I) and some of them were closely related to worldwide strains from different types of pigeons. A large number of recombination events (31 events) were also detected in the PiCV genomes from Chinese racing pigeons. Conclusions These findings indicate that PiCV strains circulating in China exhibit a high genetic diversity and also contribute to information of prevalence, genetic variation and evolution of PiCV from racing pigeons in China. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08425-8.

(Cap) [2,3]. The gene forming ORF-C1 of PiCV has been demonstrated to be highly genetically diverse as compared with the gene forming ORF-V1 [1,4]. The other ORFs including ORF-C2, ORF-C3, and ORF-C4 located on the complementary sense strand encode three viral proteins of unknown functions [2,5]. The circovirus infection in pigeons was first identified in 1993 in the USA [6] and had been considered to be strongly associated with young pigeon disease syndrome (YPDS), including weakened racing performance, weight loss, lethargy, anorexia, respiratory distress, and diarrhea [7].
In China, the first PiCV infection was detected from meat pigeons in Zhejiang province in 2009, and the full genome was sequenced [31]. In recent years, several studies have proved that PiCV was prevalent among meat pigeons in eastern and southern China [4,14]. Recently, the competitions of racing pigeons are becoming increasingly popular in China, and even, pigeon racing is considered to be a national sport. The Chinese Association of Racing Pigeon Breeders has over 5 million members. Chinese breeders compete in numerous races for different distances in their sections throughout the racing season, and more than 25 million racing pigeons from approximately 750 racing clubs were selected to take part in the competitions every year. However, there are no epidemiologic data on PiCV infections among racing pigeons. In order to investigate the prevalence, evolution, and genome characterization of PiCV in racing pigeons in China, an extensive epidemiological investigation and bioinformatic analysis of PiCV from racing pigeons were undertaken in this study. The purpose of this study is to provide novel epidemiologic data and genome characterization for PiCV strains identified from racing pigeons in China.

PCR detection and prevalence analysis of PiCV infection in Chinese racing pigeons
In this study, PiCV was detected in samples from both sick and healthy racing pigeons. The results indicated that the positive rates of PiCV were 19.3% (120/622) at the sample level. Among the samples, 14 out of 51 sick pigeons (positive rate, 27.5%) and 106 out of 571 healthy pigeons (positive rate, 18.6%) were identified to be PiCV positive, respectively. In this study, 23 out of 39 racing pigeon clubs were positive for PiCV (positive rate, 59.0%) ( Table 1). Detailed information for 120 PiCV positive samples from racing pigeons in China is shown in Table 1.

Genome characterization of PiCV genomes
A total of sixty-seven full genome sequences were obtained and deposited in GenBank with accession numbers: MW181925 to MW181991 (Table S2). The assembled circular whole genome sequences were detected as eleven sizes ranging from 2030 to 2045 nt with their cap gene ranging from 813 to 828 nt in length and rep gene of 948 or 954 nt as shown in Table S2. Among these variable genome length, the most common genome sizes were 2037 (n = 21) and 2042 (n = 13) nucleotides, respectively (Table S2). The sequence comparison among the 67 identified PiCV strains revealed nucleotide homologies of 84.2%-100% and these sequences exhibited nucleotide homologies of 83.0%-97.8% and 82.0%-98.3% as compared with Chinese PiCV reference strains and the ones from the other countries (Table S3).

Analysis of nucleotide sequence of the cap genes and amino acid sequence of the Cap proteins
To explore the genetic diversity of PiCV strains, the cap genes of 90 PiCV strains from the 120 PiCV positive samples were sequenced. Detailed information for the 90 cap genes was shown in Table S2. The 90 cap genes were used to compare with the reference sequences from China and other countries, respectively. The results showed that the 90 cap genes ranged from 813 to 828 nt in length. Fourteen, thirteen, one, fifty-three, seven and two of the 90 cap nucleotide sequences were 813 nt, 816 nt, 819 nt, 822 nt, 825 nt, and 828 nt in length, encoding a Cap protein of 270, 271, 272, 273, 274, and 275 residues, respectively (Table S2). In addition, we found that ATT and GTG also existed in the position of the start codon site. The sequence comparison of the 90 identified cap genes revealed nucleotide homologies of 71.9%-100% and deduced amino acid homologies of 71.7%-100%, and the cap genes exhibited low sequence similarities with Chinese PiCV reference strains (73.0%-99.6% nucleotide identity, 72.3%-100% amino acid identity) and the PiCV reference strains from other countries (68.8%-98.4% nucleotide identity, 63.6%-100% amino acid identity) (Table S3).
To investigate variations in the deduced amino acid sequences, the amino acid sequences of 90 identified Cap proteins and the reference strains were aligned. The results showed that there were nine major locations of deletion (compared to the consensus sequence) among the Cap proteins including locations 7, 24, 29, 30, 35, 58, 130, 182, and 266 ( Fig. 1). A comparison of entropy (Hx) in amino acid sequences of Cap proteins showed that the Hx of amino acid sequences at most positions from the 90 identified Cap proteins were higher than that of the PiCV reference strains (Fig. 2). In addition, some unique amino acid substitutions at 28 different positions were observed among the 90 Cap proteins as shown in Fig. 3.

Analysis of nucleotide sequence of the rep genes and amino acid sequence of the Rep proteins
The rep genes for 68 out of 120 PiCV positive samples were successfully sequenced. Detailed information was  Table S2. The 68 identified rep genes were used to compare with PiCV reference sequences from China and other countries. The results showed that all of the 68 rep genes used the ATG start codon and were identified as two sizes: 948 and 954 nt. Forty of the 68 rep nucleotide sequences were 948 nt in length, encoding a Rep protein with 315 residues. Twenty-eight of the 68 rep nucleotide sequences were 954 nt in size, encoding a Rep protein with 317 residues (Table S2). The difference in size was due to a 2 amino acid deletion at positions 2 and 3. The sequence alignment revealed nucleotide homologies of 90.3%-100% and deduced amino acid homologies of 92.7%-100% among the 68 rep genes. These sequences exhibited higher similarities with PiCV reference strains from China (89.0%-99.2% nucleotide identity, 89.2%-99.6% amino acid identity) and other countries (89.5%-98.3% nucleotide identity, 90.5%-99.3% amino acid identity) as compared to the cap genes (Table S3). To investigate variations in the deduced amino acid sequences of rep gene products, the amino acid sequences of 68 identified rep genes and the reference strains were aligned. The results showed that some unique amino acid substitutions at 36 different positions were observed among the 68 identified PiCV strains (Data not shown).

Phylogenetic analysis of the cap genes
A phylogenetic analysis using 126 cap gene reported in GenBank, and cap genes of 90 PiCV strains obtained in our study was performed to investigate genetic relationship (Table S1 and S2). The results indicated that a total of 216 analyzed PiCV strains could be arbitrarily divided into 9 clades, which were defined by the contractual letters A-I (Fig. 4)

Recombination analysis in PiCV genomes
For 67 identified PiCV strains, 31 recombination events were detected using RDP4 [32] (Table S4). These recombination events indicated that the strains originally from different types of pigeons or from different geographical locations may have some evidence of recombination. For example, a segment (~ 1001 bp) of three PiCV strains (LT3/SN/2018/MW181940, LT2/SN/2018/MW181939, and WQ6/SN/2018/MW181948) obtained in this study was likely descended from a large number of ancestral PiCV genomes that originated from Chinese meat pigeons, German racing pigeons, British racing pigeons, Polish fancy pigeons, and French meat pigeons (event 2). The possible breakpoints for recombination were determined during the recombination analysis. The results also indicated that the recombination breakpoint hot plots located within both the intergenic region between  Table S1 and S2

Discussion
In recent studies, Rotavirus A G18P [17] had been confirmed as a primary cause of YPDS-like diseases in domestic pigeons [33]. Furthermore, PiCV is also supposed to be an etiological agent of YPDS mostly affecting young pigeons worldwide [7]. PiCV has been reported to be prevalent in at least sixteen countries. In China, PiCV was first detected in meat pigeons in 2009 [31]. In recent years, the epidemiological survey showed that the  [4,14], implying that PiCV is widely distributed among meat pigeon populations in China. However, the epidemiology and distribution of PiCV in the racing pigeons is unknown. The main objective of this work was to evaluate the genetic diversity and epidemiology of PiCV strains circulated in the racing pigeons of China. Positive samples were detected from seven provinces and the prevalence rates among these provinces were variant. Overall, our data implied that PiCV was also widely distributed in diseased racing pigeons and healthy racing pigeons in China for the first time.
Previous studies showed that the length of PiCV genome was 2031-2043 nt [14,28,29]. However, our findings first showed that the 2030 nt (n = 1), 2044 nt (n = 2), and 2045 nt (n = 1) complete genome were present in the PiCV positive samples. Despite many genetic diversities were found in the genome of PiCV strains, no mutation was found in the conserved nonanucleotide motif (TAG TAT TAC) located at the apex of a potential stem-loop which was putatively associated with the initiation of rolling circle replication (RCR) [2,3]. Previous studies have shown many genetic diversities in the cap gene [14,23,34]. Our findings showed that the 90 identified cap genes exhibited a higher diversity as compared with the reference strains. Some unique amino acid substitutions at 28 different positions were observed among the Cap proteins. In addition, the cap nucleotide sequence with 828 nt in length encoding a novel Cap protein of 275 amino acids was first identified in two Chinese PiCV strains, TY3/SN/2016/MW181931 (clade E) and WL4/SN/2018/MW181959 (clade E). Higher diversity of Cap protein versus the Rep protein is due to the fact that the Cap protein, as the protein shell of the virus, is exposed to the host's immune cells, resulting in a stimulation of a cascade of immune responses. Positive selection occurs during this process. Some mutations can change the structure of Cap protein. This may lead to a better binding of virus with the receptors of target cells, which increases the infection ability of virus in cells. Moreover, these mutations may also protect virus from being neutralized by antibodies [1,35,36]. In this study, we identified many mutations in both N-and C-terminus of the Cap protein. According to the investigations on Cap protein of BFDV, the arginine-rich N-terminus has two functions (nuclear localization and nucleic acid binding) that enable as penetration of the viral genome into the host cell nucleus through nuclear pore complex [37]. Like BFDV, the N-terminus of PiCV Cap protein is also rich in arginine and has been predicted to be a nuclear localization signal and a nucleic acid binding domain by software [38], which means that this region is highly possible to be functionally similar to the corresponding region of the Cap protein of BFDV. Moreover, the results of this study showed that most of the amino acid deletion sites in Cap protein were concentrated in the N-terminus and many novel mutations also have been identified in this region. However, due to the lack of relevant experimental evidence, we cannot infer whether these mutations and deletions could affect the function of Cap protein.
The Cap protein has been used as a coating antigen in ELISA due to its antigenic activity and can be recognized by PiCV specific antibodies [39,40]. If the N-terminal region of the PiCV Cap protein, which is located within the capsid, resembles the corresponding region in BFDV, the N-terminal amino acid mutation and deletion may not affect the ability to bind to the antibody as a coating antigen [39]. In addition, there are also many mutations in the C-terminus where most sites are composed of hydrophilic amino acids. It means that this region may locate on the outside of the protein and may have ability for binding with antibody. Therefore, mutations of this region may lead to changes in antigenicity. Cap protein, which contains neutralizing antibody epitopes, can be considered as a potential antigen candidate in sub-unit vaccine development [41,42]. Moreover, the sub-unit vaccines based on PCV2 recombinant capsid proteins are successfully used in the prevention of PCV2-SD [43][44][45][46]. It means that the development of a sub-unit vaccine may protect pigeons from infection with PiCV [42]. Due to a high genetic diversity of Cap proteins may lead to differences in antigenicity between different strains, sub-unit vaccines may also be diversified in future. To sum up, since there is little research on the function of PiCV Cap protein, the specific meaning of these mutations cannot be accurately explained. Therefore, further experiments are needed to explain if these mutations and deletions affect the function of Cap protein. According to the obtained nucleotide sequences of cap gene, besides ATG and ATA [2,5,14], we found that ATT and GTG also existed in the position of the start codon site through sequence alignment. Moreover, a similar situation has been found in GuCV [47] and BFDV [48]. Since no available cell lines have been found to culture PiCV in vitro [49,50] and lack of commercial antibodies against PiCV Cap protein, it is hard to verify that indeed a full-length natural Cap protein is translated from the described alternative start codons. In all 68 identified Rep proteins, three amino acid motifs named FTLNNP (position [41][42][43][44][45][46], HLQGF (position 78-82), and YCSK (position 116-119) that putatively associated with RCR [2] were completely conserved. In addition, a fourth motif, which was putatively associated with dNTPase activity, namely GKS (position 197-199) [2], was also completely conserved. These sequence characteristics help understand the genetic diversity of PiCV. However, since the PCR amplification strongly relies on nucleotide homology in the regions where primers anneal, the samples containing more divergent viruses may not be obtained the full genome sequence when amplifying the viral genome. Thus, the actual diversity of circoviruses in Chinese pigeons may be even greater. This can be addressed by future studies employing next generation sequencing methods. Besides the pathogenicity of different PiCV strains still need to be validated in future.
In the present study, the 90 PiCV strains were divided into seven clades (A, B, C, E, G, H, and I) based on a phylogenetic analysis of cap gene. Additionally, we found that PiCV strains isolated from the same club belonged to different clades and shared a low sequence identity. These data suggested that the infection and evolution of PiCV in Chinese racing pigeons might have different evolutionary origins. This may result from the fact that the import of racing pigeons from all around the world to China is very significant and a lack of oversight in the international pigeon trade (racing pigeons mainly) [20,51]. In addition, another possible explanation is that the intensive trafficking of pigeons between clubs and racing events in China. Interestingly, two isolates, QYQX1/HE/2018/MW181909 (clade A) and QYQX2/HE/2018/MW181910 (clade C), with lower identity (75.5% nucleotide identity and 76.8% amino acid identity) in the cap gene were detected from the same pigeon, suggesting a horizontal transmission occurred among the racing pigeons in the same racing clubs [6,52]. The complexity of epidemic PiCV strains in China may cause difficulties to protect pigeon from infection by vaccines in future.
Viral recombination had been proved to play a significant role in the evolution of many ssDNA viruses [53]. The extensive recombination events had been reported in PiCV genomes and other circoviruses, such as PCV [54,55] and BFDV [56,57]. In this study, a large number of recombination events (31 events) were detected in the 67 identified PiCV strains. The high intensity of recombination may suggest that frequent co-infections of different PiCV strains. Moreover, recently, Khalifeh A. et al. has demonstrated that recombination plays a role in PiCV evolution through animal experiment [58]. Thus, the recombination seemed to be a key mechanism for PiCV evolution [21,23,28,29].

Conclusions
In conclusion, our study demonstrated that PiCV infection in racing pigeons was widespread in northern China and revealed the characteristics of the PiCV genome. Furthermore, the identified PiCV strains displayed a high genetic diversity. Our data also demonstrated that PiCV in Chinese racing pigeons had an extensive recombination for the first time. The high genetic diversity of PiCV may be derived from two possible facts: 1) the racing pigeons in China are imported from all around the world; and 2) a lack of oversight in the international pigeon trade (mainly refer to racing pigeons). In addition, another possible explanation is that the intensive trafficking of pigeons between clubs and racing events in China. Furthermore, the high intensity of recombination may suggest that frequent co-infections of different PiCV strains. Overall, clubs and racing events could provide a suitable chance for viral recombinants, which may pose a threat for the domestic pigeons, and has a possibility of spilling over onto wild species.

Sample collection
In total, 622 samples were collected from 11 provinces or municipalities in China from November 2016 to August 2019. Of these samples, 571 serum samples were collected from 571 healthy racing pigeons from 39 racing pigeon clubs in 11 provinces or municipalities of China, including Beijing, Hebei, Liaoning, Shanxi, Gansu, Qinghai, Shandong, Shaanxi, Xinjiang Uygur Autonomous Region, Ningxia Hui Autonomous Region and Inner Mongolia Autonomous Region. Animals obtained from the racing pigeon clubs were returned to the club owners after collecting blood samples. Moreover, the corpses of 51 sick pigeons from 7 racing pigeon clubs in 5 provinces or municipalities were submitted to our laboratory at College of Veterinary Medicine, Northwest A&F University to determine the pathogen of the disease. The 0.2 g sections of internal organs (the heart, liver, spleen, lung, and intestine) were obtained during the postmortem examination of all mentioned dead pigeons. All samples were stored at − 80 °C before DNA extraction.

DNA extraction
The tissue samples were ground to powder with liquid nitrogen and diluted with three volumes of phosphatebuffered saline. The samples (1 ml) were centrifuged at 12,000 × g for 10 min at 4 °C and the supernatants were transferred into a 1.5 ml tube. Nucleic acids from tissues and serum samples were extracted by using an Easy-Pure ® Viral DNA/RNA Kit (TransGen Biotech, Beijing, China) according to the manufacturer's instructions. The extracted genomic DNA was stored at -20 °C before use.

Genome characterization and sequence analysis of the cap, rep genes and PiCV genome
The cap gene sequences, rep gene sequence and the full genome sequences of PiCV obtained in this study have been deposited in GenBank, respectively. The sequences obtained in the current study were compared with all available PiCV sequence data from different geographical locations within China (n = 60) and the rest of the world (n = 73) from the National Center for Biotechnology Information (NCBI) nucleotide database (Table S1). Multiple sequence alignments were carried out by ClustalW algorithm using MEGA 5.0 software [60] and the homology among nucleotide and amino acid sequences was determined by using BioEdit v. 7.0.5 software [61]. A divergence analysis of the Cap protein of PiCV was performed using WebLogo Tm (http:// weblo go. three pluso ne. com/), an online software for sequence logo generator [62]. A comparison of Hx in amino acid sequences of Cap proteins of PiCV was carried out by using GraphPad Prism 7.0 software (GraphPad Software, Inc., USA).

Phylogenetic analysis
The entire cap gene sequences of PiCV obtained in this study were used for a phylogenetic analysis. All available PiCV sequence data from different geographical locations within China (n = 54) and the rest of the world (n = 72) were retrieved from the NCBI nucleotide database as reference sequences (Table S1). Maximum likelihood phylogenetic trees were constructed with the GTR + G + I model by using the MEGA 5.0 software [60], with partial deletion to handle alignment gaps and 1,000 bootstrap iterations. The cladogram was generated and annotated with the Interactive Tree Of Life (iTOL) software (http:// itol. embl. de/) [63].

Recombination analysis
In order to analyze the potential recombinant evidence, an integrated software package for the recombination detection program RDP4 [32] was used to detect potential recombinant strains, parental strains, and possible recombination breakpoints. Seven methods (RDP [64], GENECONV [65], BootScan [66], MaxChi [67], Chimaera [68], SiScan [69], and 3Seq [70]) were implemented using the RDP4 program [32]. Recombination events were identified by at least three of the aforementioned methods, and the p-value < 0.05 were considered plausible recombinant events. The sequences in the analyzed dataset that most closely resembled the parental sequences of recombinants were defined as either "minor parents" or "major parents" based on the size of the genome fragments which had contributed to the detected recombinants (with the major parent contributing the larger fragment and the minor parent the smaller).